(54) Title: A SIGNAL PROCESSING METHOD TO ANALYSE TRANSIENTS OF SPEECH SIGNALS
(57) Abstract 

The present invention is related to a method and an 
apparatus for determination of a parameter of a system generating 
a signal containing information about the parameter. The method 
comprises the step of short time Laplace transforming the signal 
and may be utilised for classifying the system in question in 
accordance with one or more determined parameters into one class 
of a set of predefined classes defined by predetermined ranges 
of values of the parameters. The invention also relates to the 
use of a shape of energy changes of a signal for identifying or 
representing features of the system generating the signal. This 
use may be applied to recognition of sound features perceivable 
by e.g. a human ear as representing a distinct sound picture. It 
has for example been found that the signal information relevant to 
recognition of speech is present in a transient part of the speech 
signal.

A SIGNAL PROCESSING METHOD FOR DETERMINATION OF A PARAMETER OF A
SYSTEM GENERATING THE SIGNAL

The present invention relates to a method for determination of a
parameter of a system generating a signal containing information
about the parameter.



The method may be used for identification of sound or speech
signals, such as in speech recognition, or for quality measurement
of audio products or systems, such as loudspeakers, hearing aids,
telecommunication systems, or for quality measurement of acoustic
conditions. The method of the present invention may also be used in
connection with speech compression and decompression in narrow band
telecommunication.

The method may also be used in analysis of mechanical vibrations 
generated by a manufactured device during operation e.g. for 
detection of malfunction of the device. 



The method may further be used in electrobiology, for example for
analysis of neuroelectrical signals such as signals from an
electroencephalograph, an electromyograph, etc.

BACKGROUND OF THE INVENTION

The three documents

HALIJAK C A et al.: "Simple Consequences of the Finite Time Laplace
Transform Analysis of the Periodically Reversed Switched Capacitors",
CIRCUITS, SYSTEMS, AND SIGNAL PROCESSING, 1985, USA, vol. 4,
no. 4, pages 503-511, XP-002105446, ISSN 0278-081X;

BARRETT T W: "The Cochlea as Laplace Analyzer for Optimum
(Elementary) Signals", ACUSTICA, Feb. 1978, WEST GERMANY, vol. 39,
no. 3, pages 155-172, XP-002105445, ISSN 0001-7884; and

HARBOR R D et al.: "THE LAPLACE TRANSFORM", ENERGY AND INFORMATION
TECHNOLOGIES IN THE SOUTHEAST, Columbia, April 9-12, 1989, vol. 1,
9 April 1989, pages 376-379, XP-000076824, IEEE;

offer relevant background art as regards the Laplace transform.

Prior art methods of signal processing are based on a short time
Fourier transform of signals and it is assumed that the signals are
steady state signals.

In steady state analysis the signal is assumed stationary in the
period the signal is analysed and the steady state spectrum is
calculated.

In real life steady state signals do not occur and steady state
analysis does not provide sufficient knowledge of phenomena within
various scientific and technological fields. Consider for example
speech analysis. The human ear has the ability to simultaneously
catch fast sound signals, detect sound frequencies with great
accuracy and differentiate between sound signals in complicated
sound environments. For instance it is possible to understand what
a singer is singing in an accompaniment of musical instruments.

It is assumed that the cochlea in the human ear can be regarded as
comprising a large number of band-pass filters within the frequency
range of the human ear.

The time response $f(t)$ for one band-pass filter due to an
excitation can be separated into two components, the transient
response, $f_t(t)$, and the steady state response, $f_s(t)$,

$$f(t) = f_t(t) + f_s(t).$$

Traditional signal processing is based on the steady state response
$f_s(t)$, and the transient response $f_t(t)$ is assumed to vanish very
fast and to be without importance for the perception, see for
example "Principles of Circuit Synthesis", McGraw-Hill 1959, Ernest
S. Kuh and Donald O. Pederson, page 12, lines 9-15, where it is
stated that:

"only the forced response is considered while the response due to
the initial state of the network is ignored".

Thus, when students are introduced to the world of signal analysis,
they learn that the transient response, i.e. the response due to
the initial state of the network, should be ignored because it
vanishes within a very short period of time. Furthermore, it is
rather difficult to analyse these transient signals by use of
traditional linear methods of analysis.

The ability of the human ear to hear very short sounds and at the
same time detect frequencies with great accuracy is in conflict
with the traditional filter-based spectrum analysis. The time window
(twice the rise time) of a band-pass filter is inversely
proportional to the bandwidth,

$$t_w = \frac{2}{f_u - f_l},$$

where $f_l$ is the lower cut-off frequency and $f_u$ is the upper
cut-off frequency.

Thus, if a rise time of 5 ms is required the consequence is that
the frequency resolution is no better than 400 Hz.

As the detection of these transients is in conflict with a high
frequency resolution, the detection of these transients by the
human ear must take place in an alternative manner. It has not
been examined how the human ear is able to detect these signals,
but it might be possible that the cochlea, when no sounds are
received, is in a position of rest, where the cochlea will be very
broad-banded. When a sound signal is received, the cochlea may
start to lock itself to the frequency component or components
within the signal. Thus, the cochlea may be broad-banded in its
starting position, but if one or more stable frequencies are
received the cochlea may lock itself to this frequency or these
frequencies with a high accuracy.

Today it is known that the nerve pulses launched from the cochlea
are synchronized to the frequency of a tone if the frequency is
less than about 1.4 kHz. If the frequency is higher than 1.4 kHz
the pulses are launched randomly and less than once per cycle of
the frequency.

Signal processing based on filter bank spectrum analysis is
disclosed in GB 2 213 623, which describes a system for phoneme
recognition. This system comprises detecting means for detecting
transient parts of a voice signal, where the principal object of
the transient detection is the detection of a point where the
speech spectrum varies most sharply, namely, a peak point. The
detection of the peak points is used for more precise phoneme
segmentation. The transient analysis of GB 2 213 623 is based on a
spectrum analysis and the change in the spectrum, which is very
different from the transient analysis of the present invention,
which is based on a direct transient detection in the time domain.

SUMMARY OF THE INVENTION

The present invention provides an approach which is different in
principle from all known methods for processing signals. The
approach taken and some of the results obtained will be explained
by way of an example in the context of analysis of speech signals.

Speech is produced by means of short pulses generated by the vocal
cords in the case of voiced speech and by friction in the vocal
tract in the case of unvoiced speech. The pulses are filtered by
the vocal tract that acts as a time-varying filter. The output
response will consist of quasi steady state terms and also
transient terms. The quasi steady state terms will only be damped
slightly in the period before the next pulse is generated. The
transient terms will be sufficiently damped in the time period
before the next pulse is generated.

The speech signal is often assumed to have only quasi steady state
terms in the period or time window of the analysis, typically 20-30
ms.

The placement of formants, the formants being energy bands in the
short time power spectrum calculated by means of a short time
spectrum analysis, has previously been assumed decisive for speech
intelligibility, together with voiced/unvoiced detection, the pitch
and the quasi steady state power.

However, a number of observations, which have been made within
the field of auditory perception research, do not conform to the
previous assumptions:

Why is it possible to understand and identify a deep male voice
through communication channels that have a higher cut-off frequency
than the male pitch?

The only difference between the pronunciation of the letters e, b,
d is in the first 1-3 ms of the voice signal and this information
will be lost if the analysis has a time window of 20-30 ms.

How can the absolute placement of these formants be decisive when
their placement is quite different for different people,
particularly between small children and large males?

Why is distortion dominated by odd order harmonics and caused by
cross-over distortion in a class B amplifier much more disturbing
than distortion dominated by even order harmonics caused by
amplitude distortion in a class A amplifier?

The short time power spectrum will not distinguish frequencies from
different sources, and tones generated by other sources than the
speech signal will act like false formants.

Why does a signal consisting of three tones with the same
frequencies as the formants for a vowel not give the slightest
perception of the vowel at all? The signal just sounds like three
separate tones.

Why is the ear very sensitive to frequency changes of a signal up
to about 1000 Hz, where changes of +/- 3 Hz can be detected? For
frequencies above 1000 Hz, the sensitivity is much smaller.

The research performed by the present applicant suggests
that the ear is tone dominant until about 1.4 - 1.6 kHz and
transient dominant above. Tone dominant means that the pulses
launched from the hair cells as a response to a tone signal are
synchronised to the tone signal. Transient dominant means, in the
present context, that the hair cells are activated by changes of
the energy with rise and fall times of at most 2 ms, typically
caused by transient pulses.

Regarding speech signals, it is assumed that the quasi steady state
terms are in the tone dominant interval of the ear and that the
transient terms are in the transient dominant interval. It is
believed that the transient terms are very important for speech
intelligibility. The transient terms are seen as transient pulses
in the speech signal. The rise time and the shape of leading and
lagging edges of the envelope of transient pulses, in terms of a
profile of damped frequencies, describe the sound picture. The
shape of the leading and lagging edges, the dynamic changes (changes
of amplitude) of the transient pulses, voiced/unvoiced detection
and the changes of pitch are decisive for speech recognition.

This approach provides a number of advantages with respect to
explaining the earlier mentioned speech perception observations.

A natural explanation as to why it is possible to understand and
identify a deep male voice through communication channels that have
a higher cut-off frequency than the male pitch is provided. The
pitch can be detected as the period between transient pulses.

The absolute placement of formants is not decisive. The damped
frequencies profile of the shape of the transient pulse envelope is
dominated by damped difference frequencies of the transient terms.

Distortion caused by cross-over distortion in a class B amplifier
generates abrupt energy changes (unwanted transients) which are
much more disturbing than distortion caused by amplitude distortion
in a class A amplifier, which does not generate the same abrupt
energy changes.

Robust data- or telecommunication is based on modulation. The
envelope of transient pulses is a kind of amplitude modulation,
transient or impulse response modulation, and will have the same
advantages.

It is unlikely that frequencies from other sources will cause
interference patterns with the speech signal that give energy
changes with time constants and shapes in the range that is
decisive for speech intelligibility. This means that transient
modulation will be robust in noisy environments and communication
channels.

The ear is probably very sensitive to changes of a frequency up
to about 1000 Hz because the nerve pulses are synchronised to the
frequency and the period between the pulses is a measure for the
frequency. In the high frequency range, where the pulses are not
synchronised to the frequency, only placement of the frequency in
the cochlea is a measure for the frequency.

According to the invention it has for example been found that the
signal information relevant to recognition of speech is present in
a transient part of the speech signal. Thus, the method of the
present invention may involve a separation of the transient part of
an auditory signal, a generation of a transient pulse corresponding
to the transient part, and analysis of the shape of the pulse. In
an auditory signal, the corresponding transient pulse may be
repeated with time intervals, and the time interval of these
periodic transient pulses is normally also analysed or determined.

In real life, the human ear reacts to energy changes at high
frequencies in order to recognise phonemes or sound pictures. But
in the present method transient pulses corresponding to the energy
changes observed by the ear are extracted at these high
frequencies, wherefore the transient pulses preferably are
transformed to the low frequency range still maintaining the
distinct features of the sound pictures or phonemes. Thus, by using
the principles of the invention, it is possible to obtain distinct
features within auditory signals by examining the transformed low
frequency signals.

The invention relates to the use of the shape of energy changes of
a signal for identifying or representing features of the system
generating the signal, for example in recognition of sound features
which can be perceived by an animal ear such as a human ear as
representing a distinct sound picture.

The method of the present invention provides an expression for the
transient conditions of the auditory signal. The method comprises a
band-pass filtration of an auditory signal within the frequency
range of the human ear and a detection of a low-pass filtered
envelope, which envelope then can be analysed with known methods of
signal analysis. The envelope is an expression of the transient
part of the signal.

The method of signal analysis, which should be used when analysing
the envelope, and the characteristics of the band-pass filter,
which should be selected, will depend on the purpose of the
analysis. The purpose may be speech recognition, quality
measurement of audio products or acoustic conditions, and narrow
band telecommunication.



The invention also relates to a system for processing a signal to
reduce the bandwidth of the signal with substantial retention of
the information of the signal. The system may further comprise
means for extracting the transient component of the auditory
signal, and it may comprise means for detecting an envelope of the
transient component.

A signal may be separated into a sum of impulse responses generated
by poles and zeroes in the system that has generated the signal, if
the time between the excitation pulses is sufficiently long compared
to the duration of the impulse responses for the system.

In WO 94/25958 it is shown that the envelope of the transient
component in a speech signal is very important for its recognition
and it is shown that the envelope of the impulse response will
contain exponential functions and difference frequencies defined by
the impulse response.

A method based on damped sine functions to extract important
features from the envelope signal is described, and examples where
the method is used on speech signals show that the features are
important in speech analysis.


Before entering into a more detailed explanation of features of the 
method of the invention, a few definitions will be given: 



In short time analysis the transient component in a signal is a
matter of definition. For auditory signals, the idea is to obtain
an expression that gives a response corresponding to the response
in the cochlea to an abrupt change in the signal energy. An abrupt
change in the signal energy corresponds to the transient component
in the auditory signal. Thus, in the present context, the term
"transient component" designates any signal corresponding to an
abrupt energy change in an auditory signal. The transient component
holds the signal information to be analysed and in order to analyse
this information the transient component may be transformed to a
corresponding transient pulse having a distinct shape. Thus, in the
present context, the term "transient pulse" refers to a pulse
having a distinct shape and substantially holding the information
of the transient component of the auditory signal and thus
corresponding to an abrupt change in the energy of the auditory
signal. As mentioned above the transient part of a sound signal may
be repeated with time intervals and thus, in the present context,
the term "periodic" when used in combination with a transient
component, response or pulse designates any transient component,
response or pulse being repeated with intervals.

The term "shape" designates any arbitrary time-varying function
(which is time-limited or not time-limited) which, within a
given time interval Tp, has a distinctly different amplitude level
in comparison with the amplitude level outside the interval. Thus,
Tp is the duration of the shape function when the shape function is
time-limited, or the duration of the part of the function which has
a distinctly different amplitude level in comparison with the
amplitude level outside the time interval.

In order to extract information from the shape of the energy
changes, one broad aspect of the invention relates to representing
the shape of the energy changes by the short time Laplace transform
of a transient pulse of the signal. Several methods can be
applied in order to obtain a transient pulse corresponding to the
change in energy, but it is preferred that an envelope detection is
used, where the envelope preferably should be detected from a
transient response of the energy change in the auditory signal.

The energy change representing the distinct sound picture can be a
phoneme or vowel or any other sound which gives a sudden energy
change in an auditory signal.

It is also an aspect of the invention to provide a method for
identifying, in an auditory signal, energy changes which can be
perceived by an animal ear such as a human ear as representing a
distinct sound picture, the method comprising comparing the shape
of energy changes of the signal with predetermined energy change
shapes representing distinct sound pictures. For the identification
it is preferred that the shape of the energy changes is
represented by the shape of a transient pulse of the signal, and it
is furthermore preferred that the shape of the transient pulse
should be obtained by an envelope detection of a transient response
of the energy change in the auditory signal.

The invention also relates to a method for processing a signal so
as to reduce the bandwidth of the signal with substantial retention
of the information of the signal, comprising extracting a transient
part of the signal. The method may further comprise detecting an
envelope of the transient part of the signal.

Known methods of processing signals are based on a short time
Fourier transform of signals, and it is assumed that the signals
are steady state signals.

In steady state analysis the signal is assumed stable in the period
the signal is analysed, and the steady state spectrum is
calculated.

In WO 94/25958 it is disclosed that transient pulses are important
for speech coding and decoding in narrow band communication, for
speech recognition and synthesis, and for sound quality in auditory
products (i.e. loudspeakers, amplifiers and hearing aids).

An important part of a transient signal is the exponential
functions or damping ratios or time constants. The damping ratio is
the reason that the impulse response has a finite duration. The
fact that the transient signal is important for auditory perception
indicates that the response from the hair cells is dependent on the
time constants. If this is the case, it is possible that the
damping ratios in the response from nerve cells in general are
important for the human nerve system.

Transient signals are also important in many other applications,
among others signals generated by impacts from defects in rolling
bearings and gearboxes.

Based on the transient signal, it is possible to determine the
natural time constants and frequencies in the system generating the
signal. Further it is possible to determine the excitation pulses
of the system.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1      shows a time-domain representation of a linear
            time-invariant system,

Fig. 2      shows the impulse response of a third-order Butterworth
            low-pass filter with a cut-off frequency at 700 Hz,

Fig. 3      shows the response with the filter relaxed for $t < 0$
            and with a 4000 Hz tone as input at $t > 0$,

Fig. 4      shows the s-plane with poles and the zero for
            $H(\sigma,\omega)$,

Fig. 5      shows $H(\sigma,\omega)$ for $\omega_1$ and $\omega_2$
            analysed parallel with the $\sigma$ axis,

Fig. 6      shows transient characteristics in speech signals,

Figs. 7-12  show processed speech signals,

Fig. 13     shows a schematic of a filter bank according to the
            present invention.

DETAILED DESCRIPTION OF THE DRAWINGS

The importance of the transient part of a signal has been an
overlooked phenomenon in signal analysis.

The response of a linear system to either an impulse or a step
function is defined by its transient response properties.

The relationship between the input and the output for the linear
time-invariant system shown in Fig. 1 can be written as the
convolution of the input signal and the impulse response of the
system:

$$v_o(t) = \int_{-\infty}^{t} v_i(\tau)\, h(t-\tau)\, d\tau \qquad (1)$$

If the system is initially relaxed and the input signal $v_i(t)$ is
zero for $t < 0$ then the lower integration limit of Eq. (1) can be
replaced with zero. Eq. (1) then shows the important role played by
the impulse response in terms of the actual signal processing that
is performed by the system. It states that the input signal is
weighted or multiplied by the impulse response at every instant in
time and, at any specific point in time, the output is the
summation or integral of all past weighted inputs.
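
The weighting and summation described for Eq. (1) can be illustrated numerically. The following is a minimal sketch, assuming a uniformly sampled signal and a rectangle-rule approximation of the integral; the function name `convolve_relaxed`, the single-pole impulse response and all numeric values are illustrative assumptions and are not taken from the disclosure.

```python
import math

def convolve_relaxed(v_in, h, dt):
    """Approximate v_o(t) = integral_0^t v_i(tau) * h(t - tau) dtau on a uniform grid.

    Assumes the system is initially relaxed and v_in is zero for t < 0,
    so the lower integration limit is zero.
    """
    out = []
    for n in range(len(v_in)):
        acc = 0.0
        for m in range(n + 1):
            if n - m < len(h):
                # Each past input sample is weighted by the impulse response.
                acc += v_in[m] * h[n - m]
        out.append(acc * dt)
    return out

# Illustrative values: h(t) = exp(-sigma0 * t) driven by a unit step.
dt = 1e-4
sigma0 = 100.0
h = [math.exp(-sigma0 * k * dt) for k in range(1000)]
step = [1.0] * 1000
v_out = convolve_relaxed(step, h, dt)
```

With these assumed values the output rises towards roughly 1/sigma0, the steady state value of the step response of a single real pole.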

The impulse response of a real system has a finite duration and the
transient response has the same duration. Fig. 2 shows the impulse
response of a third-order Butterworth low-pass filter with a cut-off
frequency at 700 Hz. Fig. 3 shows the response with the filter
relaxed for $t < 0$ and with a 4000 Hz tone as input at $t > 0$.



In many processes $v_i(t)$ will be a pulse with a short duration and
$v_i(t) \approx 0$ before the next pulse will be generated.

The Laplace transform of a signal $v(t)$ is defined by

$$L(s) = \int_0^{\infty} v(t)\, e^{-st}\, dt \qquad (2)$$

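If $v(t)$ is the impulse response $h(t)$ for a system with 2 complex
poles

$$h(t) = e^{-(\sigma_0 + j\omega_0)t} + e^{-(\sigma_0 - j\omega_0)t}, \quad t \geq 0 \qquad (3)$$

and 0 for $t < 0$, and $s \neq -(\sigma_0 \pm j\omega_0)$,
the Laplace transform is

$$H(s) = \frac{1}{s + \sigma_0 + j\omega_0} + \frac{1}{s + \sigma_0 - j\omega_0}$$

or

$$H(\sigma,\omega) = \frac{2(\sigma + \sigma_0 + j\omega)}{(\sigma + \sigma_0 + j(\omega + \omega_0))(\sigma + \sigma_0 + j(\omega - \omega_0))} \qquad (4)$$

From Eq. (4) it is seen that for $(\sigma,\omega) \to (-\sigma_0, \pm\omega_0)$,
$H(\sigma,\omega) \to \pm\infty$.

This is a well-known phenomenon and a logical consequence of this
is as follows:

If the signal analysed is dominated by the impulse response of the
system generating the signal, it is possible to determine the
natural time constants and frequencies for the system.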
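
As an illustration of why the poles reveal the natural time constants and frequencies, the transform of a two-pole impulse response can be evaluated at points far from and close to a pole. This is a minimal sketch: the function `H` assumes the two-term form h(t) = exp(-(sigma0 + j*omega0) t) + exp(-(sigma0 - j*omega0) t), and the pole location is an arbitrary illustrative choice.

```python
import math

def H(s, sigma0, omega0):
    """Laplace transform of the assumed two-pole impulse response.

    The poles lie at s = -(sigma0 + j*omega0) and s = -(sigma0 - j*omega0).
    """
    return 1.0 / (s + sigma0 + 1j * omega0) + 1.0 / (s + sigma0 - 1j * omega0)

# Illustrative pole: damping 200 1/s at 1000 Hz.
sigma0 = 200.0
omega0 = 2.0 * math.pi * 1000.0

# Magnitude on the j-omega axis, at the pole frequency.
far = abs(H(complex(0.0, omega0), sigma0, omega0))
# Magnitude one unit of sigma away from the pole.
near = abs(H(complex(-sigma0 + 1.0, omega0), sigma0, omega0))
```

Scanning sigma as well as omega therefore locates the pole much more sharply than a scan along the j-omega axis alone, which is the property the analysis relies on.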



Fig. 5 shows a plot of $H(\sigma,\omega)$ for $\omega = \omega_1$ and
$\omega = \omega_2$.

Analysing a signal along or parallel with the $j\omega$ axis will give a
frequency profile for a given $\sigma$.

Analysing a signal along or parallel with the $\sigma$ axis will give a
time constant profile for a given $j\omega$.

If a signal has a time constant profile with significant variations
for specific frequencies, the signal is transient dominated.
Conversely, if the profile does not vary significantly for any
frequency, the signal is steady state dominated.

A short time Laplace transform is defined by:

$$L(\sigma,\omega,t) = \int_0^t v_i(t-\lambda)\, e^{-(\sigma + j\omega)\lambda}\, d\lambda \qquad (5)$$

in which $v_i$ is the signal, $L$ is the transformed signal, $\sigma$ is a
time constant, and $\omega$ is an angular frequency.

It is not possible to calculate the short time Laplace transform in
the same way as DFT in the discrete time domain because two
arbitrary exponential functions, $e^{at}$ and $e^{bt}$, are not orthogonal
with respect to each other.

The short time Fourier analysis in the analogue time domain is
based on a filter bank method. In this paper an equivalent method
will be developed for the Laplace transform.

From Eq. (1) and Eq. (3):

$$v_o(t) = \int_0^t v_i(t-\lambda)\, e^{-(\sigma + j\omega)\lambda}\, d\lambda + \int_0^t v_i(t-\lambda)\, e^{-(\sigma - j\omega)\lambda}\, d\lambda = u(t) + u^*(t) \qquad (6)$$

where $u^*(t)$ is the complex conjugate of $u(t)$ and we have

$$\mathrm{Re}[L(\sigma,\omega,t)] = \tfrac{1}{2}\, v_o(t) \qquad (7)$$

From Eq. (6) and Eq. (7) it is seen that filtering the signal $v_i(t)$ by
a filter with the impulse response $h(\sigma,\omega,t)$ with 2 complex poles
will represent the real part of the short time $L(\sigma,\omega,t)$ transform.

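If we let $v_i(t)$ be equal to the impulse response of a single pole we
have

$$L(\sigma,\omega,t) = \int_0^t k\, e^{-(\sigma_0 + j\omega_0)(t-\lambda)}\, e^{-(\sigma + j\omega)\lambda}\, d\lambda = k\, \frac{e^{-(\sigma_0 + j\omega_0)t} - e^{-(\sigma + j\omega)t}}{(\sigma - \sigma_0) + j(\omega - \omega_0)} \qquad (8)$$

and from Eq. (7) we have

$$v_o(t) = 2k\, \frac{(\sigma - \sigma_0)\left(e^{-\sigma_0 t}\cos(\omega_0 t) - e^{-\sigma t}\cos(\omega t)\right) - (\omega - \omega_0)\left(e^{-\sigma_0 t}\sin(\omega_0 t) - e^{-\sigma t}\sin(\omega t)\right)}{(\sigma - \sigma_0)^2 + (\omega - \omega_0)^2} \qquad (9a)$$

or

$$\frac{v_o(t)}{2k} = \frac{e^{-\sigma_0 t}\left((\sigma - \sigma_0)\cos(\omega_0 t) - (\omega - \omega_0)\sin(\omega_0 t)\right) - e^{-\sigma t}\left((\sigma - \sigma_0)\cos(\omega t) - (\omega - \omega_0)\sin(\omega t)\right)}{(\sigma - \sigma_0)^2 + (\omega - \omega_0)^2} \qquad (9b)$$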

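Eq. (9) is not defined for $(\sigma,\omega) = (\sigma_0,\omega_0)$ but from (8) we
have in this case

$$L(\sigma_0,\omega_0,t) = \int_0^t k\, e^{-(\sigma_0 + j\omega_0)t}\, d\lambda = k\, t\, e^{-(\sigma_0 + j\omega_0)t}$$

and

$$v_o(t) = 2kt\, e^{-\sigma_0 t}\cos(\omega_0 t) \qquad (10)$$

and we have $v_o(t) \to 0$ for $t \to \infty$.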

15 

Eq. (9) shows that the gain is inversely related to $\sigma - \sigma_0$ and
$\omega - \omega_0$, and when $(\sigma_0,\omega_0)$ is far from $(\sigma,\omega)$ and
$e^{-\sigma t} - e^{-\sigma_0 t}$ is small, $v_o(t) \approx 0$. For
$(\sigma_0,\omega_0) \to (\sigma,\omega)$, $v_o(t)$ will have Eq. (10) as the limit. It
is not immediately obvious whether Eq. (9) has the maximum energy for
$(\sigma_0,\omega_0) = (\sigma,\omega)$.
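
The closed form of the filter response can be checked against a direct numerical evaluation of the convolution. The sketch below assumes that the response is twice the real part of the short time transform of a single-pole signal k*exp(-(sigma0 + j*omega0) t); the function names, the midpoint integration rule and the parameter values are illustrative assumptions.

```python
import cmath
import math

def v_o_closed(t, k, sigma, omega, sigma0, omega0):
    """Closed-form response, written as 2*Re of the single-pole transform."""
    ds, dw = sigma - sigma0, omega - omega0
    a = math.exp(-sigma0 * t) * (ds * math.cos(omega0 * t) - dw * math.sin(omega0 * t))
    b = math.exp(-sigma * t) * (ds * math.cos(omega * t) - dw * math.sin(omega * t))
    return 2.0 * k * (a - b) / (ds * ds + dw * dw)

def v_o_numeric(t, k, sigma, omega, sigma0, omega0, steps=5000):
    """Midpoint-rule evaluation of 2*Re of the convolution integral."""
    p0 = complex(sigma0, omega0)   # pole of the signal
    p = complex(sigma, omega)      # pole of the analysing filter
    d = t / steps
    total = 0j
    for n in range(steps):
        lam = (n + 0.5) * d
        total += k * cmath.exp(-p0 * (t - lam)) * cmath.exp(-p * lam)
    return 2.0 * (total * d).real

# Illustrative parameters only.
params = dict(k=1.0, sigma=300.0, omega=2 * math.pi * 500.0,
              sigma0=150.0, omega0=2 * math.pi * 450.0)
t = 2e-3
difference = abs(v_o_closed(t, **params) - v_o_numeric(t, **params))
```

The two evaluations agree to within the integration error, which supports reading the closed form as a filter response whose gain grows as the filter pole approaches the signal pole.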

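In the DC domain Eq. (9) can be written as

$$v_o(t) = 2k\, \frac{e^{-\sigma_0 t} - e^{-\sigma t}}{\sigma - \sigma_0} \qquad (11)$$

The maximum for $v_o(t)$ can be found as follows:

$$\frac{dv_o(t)}{dt} = 0$$

when

$$t_m = \frac{\log(\sigma) - \log(\sigma_0)}{\sigma - \sigma_0} \qquad (12)$$

and Eq. (11) will have the maximum for this value.

It can be shown that $t_m \to 1/\sigma_0$ when $\sigma \to \sigma_0$.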



When cr « we will have the approximated maximum with i — 



10 yo(i-) = ^f^- (1^) 



From Eq.{13) it can be shown that 

for CT^cr^ 

In Eq. (11) $e^{-\sigma_0 t}$ represents the signal to be analysed and
$e^{-\sigma t}$ the filter. Table 1 shows the result with a filter having
$\sigma = 100\ \mathrm{s}^{-1}$ and the signal varying from 1 to 10000 $\mathrm{s}^{-1}$.

It is not surprising that the convolution acts as a low-pass
filter. The important fact is that the exponential function in the
DC domain in some way acts as frequencies do in the frequency
domain.

Table 1 also shows the result of a convolution where the signal
is differentiated. The result is, as expected, a high-pass filter.



If we look at Eq. (9a) without exponential functions it can be
written as

$$v_o(t) = 2k\, \frac{\sin(\omega t) - \sin(\omega_0 t)}{\omega - \omega_0} \qquad (14)$$

and it is seen that for $\omega \to \infty$ we will have $v_o \to 0$.



Filter: $\sigma = 100\ \mathrm{s}^{-1}$

sigma_0 (s^-1)   t_m (s)        v_o(t_m)       v_o1(t_m)
1                0.046516871    0.954548457    0.009545485
10               0.025584279    0.774263683    0.077426368
100              0.010000000    0.367879441    0.367879441
1000             0.002558428    0.077426368    0.774263683
10000            0.000465169    0.009545485    0.954548457

Table 1. $v_o(t_m)$ is given by Eq. (11, 12) and normalised
by $\sigma$ and $2k$. $v_{o1}(t_m)$ is a convolution where the
signal is differentiated and normalised by $2k$.
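
The tabulated values can be reproduced from Eq. (11) and Eq. (12). The sketch below is an illustration under two assumptions inferred from the numbers themselves: the first value column is the time of the maximum, and the differentiated-signal column equals the peak of Eq. (11) scaled by sigma_0 instead of sigma. The function names are hypothetical.

```python
import math

def t_max(sigma, sigma0):
    """Time of the maximum of Eq. (11), i.e. Eq. (12); 1/sigma when sigma0 == sigma."""
    if sigma == sigma0:
        return 1.0 / sigma
    return (math.log(sigma) - math.log(sigma0)) / (sigma - sigma0)

def v_o_over_2k(t, sigma, sigma0):
    """Eq. (11) divided by 2k; t*exp(-sigma*t) when sigma0 == sigma."""
    if sigma == sigma0:
        return t * math.exp(-sigma * t)
    return (math.exp(-sigma0 * t) - math.exp(-sigma * t)) / (sigma - sigma0)

def table_row(sigma0, sigma=100.0):
    tm = t_max(sigma, sigma0)
    peak = v_o_over_2k(tm, sigma, sigma0)
    # Second value: peak normalised by sigma and 2k.
    # Third value: peak scaled by sigma0 (assumed effect of differentiating
    # the signal exp(-sigma0*t)), normalised by 2k.
    return tm, sigma * peak, sigma0 * peak

rows = {s0: table_row(s0) for s0 in (1.0, 10.0, 100.0, 1000.0, 10000.0)}
```

Read this way, the second column falls as sigma_0 rises (low-pass behaviour) and the third column rises with sigma_0 (high-pass behaviour), with both equal to 1/e when the signal and filter time constants coincide.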



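For $\omega \ll \omega_0$ we will have

$$v_o(t) \approx -2k\, \frac{\sin(\omega t) - \sin(\omega_0 t)}{\omega_0} \qquad (15)$$

It can be shown that for $\omega \to \omega_0$ we will have

$$v_o(t) \to 2kt\cos(\omega_0 t) \qquad (16)$$

This result is, as expected, unstable.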

In transient analysis only the beginning of the signal is of
interest, and if $\omega_0 t \gg 1$ Eq. (14) will act as a band-pass filter.

Speech processing is based on fast energy pulses generated by the
vocal cords or by friction in the articulation channel, weighted by
the impulse response in the articulation channel. The rise time for
the excitation pulses has to be sufficiently faster than the rise
time of the energy of the impulse response.

The shape of energy pulses is an important feature in speech. If the
time between the pulses is periodical it is voiced speech, and if
not it is unvoiced speech. For some phonemes abrupt changes in the
energy pulses are important.

From WO 94/25958 it is known that the shape of the energy pulses
is important for speech recognition, especially the leading edge.
In the following a method to extract features will be developed
based on an envelope detection.

The convolution expressed in Eq. (9) can be regarded as a response
from 2 poles in the articulation channel excited by an impulse. If
$\sigma_0 \approx \sigma$ we have from Eq. (9a)

$$v_o(t) \approx 2k\, e^{-\sigma t}\, \frac{\sin(\omega t) - \sin(\omega_0 t)}{\omega - \omega_0} \qquad (17)$$
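
The envelope is defined as

$$e(t) = \sqrt{v_o(t)^2 + \hat{v}_o(t)^2}$$

where

$$\hat{v}_o(t) = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{v_o(\tau)}{t - \tau}\, d\tau$$

is the Hilbert Transform.

The envelope of Eq. (17) is then

$$e(t) = \frac{2k\, e^{-\sigma t}}{\omega - \omega_0} \sqrt{(\sin(\omega t) - \sin(\omega_0 t))^2 + (-\cos(\omega t) + \cos(\omega_0 t))^2}$$

$$= \frac{2k\, e^{-\sigma t}}{\omega - \omega_0} \sqrt{2(1 - \cos((\omega - \omega_0)t))}$$

$$\approx \frac{2\sqrt{2}\, k\, e^{-\sigma t}}{\omega - \omega_0} \left(1 - \tfrac{1}{2}\cos((\omega - \omega_0)t)\right) \qquad (18)$$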

The approximation is acceptable because 

As expected the envelope has a component with the difference
frequency of the 2 frequencies.

The conclusion is that we can expect to find damped difference
frequencies in the envelope of the transient component.
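
The appearance of the difference frequency can be confirmed numerically. The sketch below checks that the sum of squares of sin(wt) - sin(w0 t) and its quadrature counterpart -cos(wt) + cos(w0 t) equals 2(1 - cos((w - w0) t)), so that only the difference frequency remains under the square root. The two frequencies are illustrative assumptions, and the common damping factor is left out because it multiplies both parts equally.

```python
import math

def envelope_sq(t, omega, omega0):
    """Squared envelope of sin(omega*t) - sin(omega0*t), using its quadrature part."""
    s = math.sin(omega * t) - math.sin(omega0 * t)
    c = -math.cos(omega * t) + math.cos(omega0 * t)
    return s * s + c * c

# Illustrative frequencies: 2800 Hz and 2500 Hz, difference 300 Hz.
omega = 2.0 * math.pi * 2800.0
omega0 = 2.0 * math.pi * 2500.0
samples = [n * 1e-5 for n in range(1000)]
max_err = max(
    abs(envelope_sq(t, omega, omega0) - 2.0 * (1.0 - math.cos((omega - omega0) * t)))
    for t in samples
)
```

The two carrier frequencies cancel out of the envelope entirely, which is why the envelope can be analysed in a much lower frequency range than the band-pass filtered signal.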

To detect the damped difference frequencies a filter bank is used.
The features might be detected as a convolution between the
transient pulse and the impulse response of the filters.

In general form the impulse response can be written as

$$h(t) = k\, e^{-\sigma t} \sin(f(\lambda)t + \varphi)$$

where $\sigma = \lambda$ and $\omega = f(\lambda)$.

In the following analysis $f(\lambda) = 1.5\lambda$, $k = \omega = 1.5\lambda$, and
$\varphi = 0$ are selected and we have

$$h(t) = 1.5\lambda\, e^{-\lambda t} \sin(1.5\lambda t) \qquad (19)$$

By selecting ω = 1.5a, Eq. (19) will act as a band-pass filter with a
low Q in relation to the frequencies. Other ratios ω/a than 1.5 may
be selected, and it is presently preferred that the ratio ω/a
ranges from 0.5 to 2.5. The exponential function gives the advantage
that it acts like a natural time window that ensures that the signal
is naturally damped. The values of the parameters are selected by
studying rise times in important transient pulses and by
experiments.
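
As an illustration, the impulse response can be evaluated directly. The Python sketch below assumes the reading h(t) = k·exp(-λt)·sin(ωt) with k = ω = ratio·λ and φ = 0, and h(t) = 0 for t < 0. The function names and the example value λ = 3000 s⁻¹ are assumptions made for this sketch.

```python
import math


def impulse_response(t, lam, ratio=1.5):
    """h(t) = omega * exp(-lam*t) * sin(omega*t) with omega = ratio*lam; zero for t < 0."""
    if t < 0:
        return 0.0
    omega = ratio * lam
    return omega * math.exp(-lam * t) * math.sin(omega * t)


def peak_time(lam, ratio=1.5):
    """Time of the first maximum of h(t), found from tan(omega*t) = omega/lam = ratio."""
    return math.atan(ratio) / (ratio * lam)


# Assumed example: lam = 3000 1/s lies inside the 1500..12000 1/s range used in the analysis.
LAM = 3000.0
t_peak = peak_time(LAM)
h_peak = impulse_response(t_peak, LAM)
```

For λ = 3000 s⁻¹ the first maximum falls at about 0.22 ms. Because the ratio ω/a is fixed, a larger λ gives a response that both rises and decays faster, which is why the rise times of the transient pulses guide the choice of parameter values.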

Fig. 6 shows transient characteristics in speech signals. The top
figure shows 50 ms of an "a" in "hard key" pronounced by a female.

The second signal is a band-pass filtration of the speech signal.
The band-pass filter is a Butterworth filter with 6 poles and a
pass band from 2150 to 3550 Hz. This frequency band contains
important transient pulses in the sensitive frequency interval of
the ear.

The third signal is an energy detection of the transient
characteristics of the band-pass filtered speech signal. The
detection is an envelope detection performed by means of
rectification and low-pass filtration of the signal. The low-pass
filter is a Butterworth filter with 3 poles and a cut-off frequency
of 700 Hz.
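
The rectify-then-smooth structure of this envelope detector can be sketched in a few lines of Python. To stay dependency-free, the sketch swaps in a simple one-pole low-pass filter in place of the 3-pole Butterworth filter, so it illustrates the principle rather than reproducing the filter of Fig. 6. The sampling rate of 48 kHz and the 2850 Hz test tone are assumptions.

```python
import math


def envelope_detect(signal, fs, cutoff_hz=700.0):
    """Full-wave rectification followed by a one-pole low-pass filter.

    The one-pole filter is a stand-in for the 3-pole Butterworth filter.
    """
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y = 0.0
    out = []
    for x in signal:
        y += alpha * (abs(x) - y)  # smooth the rectified sample
        out.append(y)
    return out


FS = 48000.0  # assumed sampling rate in Hz
# Test tone in the middle of the 2150-3550 Hz band, with unit amplitude.
tone = [math.sin(2.0 * math.pi * 2850.0 * n / FS) for n in range(4800)]
env = envelope_detect(tone, FS)
mean_tail = sum(env[-1000:]) / 1000.0
```

For a steady tone of unit amplitude the detected envelope settles near 2/π, the mean of a rectified sine, with some residual ripple that a higher-order filter would suppress further.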



In WO 97/09712 a method for automatically detecting the leading
edges is disclosed. The method uses the maximum slope of the
leading edge as a reference: the leading edge is defined to begin
at the point before the maximum slope where the slope falls below a
given threshold (10-20 % of the maximum slope).
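
A minimal Python sketch of this kind of leading-edge detection is given below. It is not the implementation of WO 97/09712. It assumes a sampled envelope, approximates the slope by the first difference, and uses a threshold of 15 % of the maximum slope. The function name and the convention for reporting the start index are assumptions.

```python
def leading_edge_start(envelope, threshold=0.15):
    """Return (start_index, max_slope_index) of the leading edge, or None.

    slopes[i] is the difference between samples i+1 and i. Starting at the
    steepest point, the search walks backwards while the slope stays at or
    above threshold * max_slope; the edge starts where it first drops below.
    """
    slopes = [b - a for a, b in zip(envelope, envelope[1:])]
    if not slopes:
        return None
    i_max = max(range(len(slopes)), key=slopes.__getitem__)
    max_slope = slopes[i_max]
    if max_slope <= 0:
        return None  # no rising edge in the signal
    limit = threshold * max_slope
    i = i_max
    while i > 0 and slopes[i - 1] >= limit:
        i -= 1
    return i, i_max


# Hypothetical envelope samples: slow creep followed by a steep rise.
example = [0.0, 0.0, 0.0, 0.01, 0.02, 0.2, 0.6, 0.9, 1.0, 1.0]
edge = leading_edge_start(example)
```

In the example the slow creep before sample 4 stays below the threshold, so the leading edge is taken to start at sample 4, while the steepest rise lies between samples 5 and 6.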

The transient (envelope) signal in Fig. 6 has a DC component,
which does not contain any information. Therefore it is preferred
that the signal is differentiated before it is analysed, e.g. by the
filter bank shown in Fig. 13.

In Fig. 13, the filters (h1(t), h2(t), ..., hn(t)) in the filter bank
connected between the input and the envelope detectors are
band-pass filters having bandwidths corresponding to the bandwidths
of the band-pass filters of the cochlea and having centre frequencies
ranging from 1400 Hz to 6500 Hz.

The output signals o_ij(p) from the filter bank shown in Fig. 13 are
calculated by:

$$ h_{ij}(p) = 1.5\lambda_j e^{-\lambda_j p}\sin(1.5\lambda_j p), \quad i = 0, 1, \ldots, N-1, \quad j = 0, 1, \ldots, M-1 $$

$$ h_{ij}(p) = 0, \quad p < 0 $$

$$ o_{ij}(p) = \sum_{k=0}^{p} t'_i(k)\, h_{ij}(p-k), \quad p = 0, 1, \ldots, P-1 $$

where j = 0, 1, ..., M-1 and M is the number of band-pass filters
with a low Q in the filter bank connected between the outputs and
the envelope detectors, p = 0, 1, ..., P-1 is the sample number,
t'_i is the differentiated transient signal, and λ_j is the filter
bank parameter, normalised by the sampling frequency.

In the analysis M is selected to be 10 and 1500 < λ'_j < 12000 s⁻¹,
where λ'_j is not normalised. By this we have 1885 < ω_j < 18850 s⁻¹
or 300 < f_j < 3000 Hz.
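
The processing chain for one transient signal (differentiation, convolution with each low-Q filter, and extraction of the output maxima) can be sketched as follows. This is an illustration under stated assumptions, not the apparatus of Fig. 13: the filter is read as h(p) = 1.5λ·exp(-λp)·sin(1.5λp) with λ normalised by the sampling frequency, the M = 10 values of λ are spaced geometrically between 1500 and 12000 s⁻¹ (the spacing is not given in the text), the impulse responses are truncated to 256 samples, and the sampling rate of 48 kHz and the test envelope are made up for the example.

```python
import math


def differentiate(x):
    """First difference, padded with a leading zero to keep the length."""
    return [0.0] + [b - a for a, b in zip(x, x[1:])]


def low_q_filter(lam_norm, length, ratio=1.5):
    """Sampled impulse response for a normalised filter parameter lam_norm."""
    w = ratio * lam_norm
    return [w * math.exp(-lam_norm * p) * math.sin(w * p) for p in range(length)]


def convolve_causal(x, h):
    """o(p) = sum over k of x(k) * h(p - k), with h(p) = 0 for p < 0."""
    out = []
    for p in range(len(x)):
        first = max(0, p - len(h) + 1)
        out.append(sum(x[k] * h[p - k] for k in range(first, p + 1)))
    return out


def geometric_lambdas(low, high, m):
    """Assumed geometric spacing of the m filter parameters."""
    return [low * (high / low) ** (j / (m - 1)) for j in range(m)]


def filter_bank_maxima(envelope, fs, lambdas, length=256):
    """Maximum of each filter output for the differentiated envelope."""
    d = differentiate(envelope)
    return [max(convolve_causal(d, low_q_filter(lam / fs, length))) for lam in lambdas]


FS = 48000.0  # assumed sampling rate in Hz
lambdas = geometric_lambdas(1500.0, 12000.0, 10)
test_envelope = [1.0 - math.exp(-n / 20.0) for n in range(200)]  # made-up rising edge
maxima = filter_bank_maxima(test_envelope, FS, lambdas)
```

The list `maxima` holds one value per filter. Plotted against the filter parameter, it gives a curve of output maxima for the transient.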



This filtering process is not done in the cochlea but in the hair
cells or in the nervous system behind the hair cells.

Figs. 7, 8, 9, 10, 11, and 12 show the output of the processing
of transient signals in the vowels "a", "o", and "i" in "hard key" and
"soft key" pronounced by a female and a male. Further, the figures
show plots of the maxima of the output signals as a function of the
time constant a of the corresponding filter.



The figures show that the maximum curves are very much alike for the
same vowel, independent of whether a female or a male pronounces it.

With a library of templates and a distance measure it is possible
to identify the sound picture, and this can be used for speech
recognition and narrow band communication.
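
Template matching of this kind can be sketched as a nearest-neighbour search. The Python example below assumes a Euclidean distance measure, since the text does not specify one, and uses made-up four-point template curves. The labels, values and function names are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two curves of equal length."""
    if len(a) != len(b):
        raise ValueError("curves must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(curve, templates):
    """Return the label of the template closest to the measured curve."""
    return min(templates, key=lambda label: euclidean_distance(curve, templates[label]))


# Hypothetical template library: one maxima curve per vowel (values are made up).
TEMPLATES = {
    "a": [0.2, 0.6, 1.0, 0.7],
    "o": [0.9, 1.0, 0.5, 0.2],
    "i": [0.1, 0.3, 0.6, 1.0],
}

measured = [0.25, 0.55, 0.95, 0.75]  # made-up measurement close to the "a" template
label = identify(measured, TEMPLATES)
```

In practice the curves would probably need to be normalised before comparison, for example to unit maximum, so that overall signal level does not dominate the distance.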



Thus, according to the invention a method and an apparatus are 
provided for determination of a parameter of a system generating a 
signal containing information about the parameter, in which the 
signal is short time transformed substantially in accordance with 



0 



in which v_i is the signal, L is the transformed signal, a is a time
constant, ω is an angular frequency, and φ is a phase, or in
accordance with another transformation which will give rise to an
L'(a,ω,t) which, in time intervals within which L(a,ω,t) is larger
than 10% of its maximum value, is not more than 50% different from
the result given by the short time Laplace transformation.
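
One possible reading of this tolerance criterion is sketched below in Python. It assumes that both transforms are available as equally long sample sequences, that "larger than 10% of its maximum value" refers to the magnitude of L, and that "50% different" means a relative deviation of at most 0.5 with respect to L. These interpretations and the function name are assumptions.

```python
def within_tolerance(reference, alternative, level=0.10, max_rel_diff=0.50):
    """Check an alternative transform against the reference transform.

    Only samples where |reference| exceeds level * max|reference| are tested;
    there the alternative may deviate by at most max_rel_diff * |reference|.
    """
    if len(reference) != len(alternative):
        raise ValueError("sequences must have the same length")
    peak = max(abs(v) for v in reference)
    for ref, alt in zip(reference, alternative):
        if abs(ref) > level * peak and abs(alt - ref) > max_rel_diff * abs(ref):
            return False
    return True


# Made-up sample values for illustration.
L_ref = [0.0, 0.05, 0.5, 1.0, 0.4]
L_close = [0.3, 0.2, 0.7, 0.6, 0.25]   # deviates, but within 50 % where L_ref is significant
L_far = [0.0, 0.05, 0.5, 0.4, 0.4]     # deviates by 60 % at the maximum
```

Samples where the reference is at or below 10 % of its maximum are ignored, so `L_close` passes although it differs strongly in the first two samples.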

In narrow band communication the transient pulses have to be 
identified and coded, and the decoder will contain a library of 
filters with corresponding transient responses. The decoder library 
could also contain the transient responses. 



The present invention also relates to measurement of mechanical
vibrations, e.g. when testing devices that generate mechanical
energy during operation, such as mechanical devices with moving
parts, e.g. compressors for refrigerators, electric motors,
household machines, electric razors, combustion engines, etc.



For example, it is known that measurement of vibration generated or
sound emitted by a device during operation can be useful for
detection of malfunction of the device. Certain failures may
generate sound or vibration of specific characteristics that can be
recognised.

The method may also comprise steps of classification for
classifying a tested device in accordance with the determined
parameters into one class of a set of predefined classes. Each
predefined class may be defined by a set of upper and lower limits
for specific parameters determined according to the method. A
device may then be classified as belonging to a certain class if
its corresponding parameter values lie within the corresponding upper
and lower limits of the class.
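
The classification rule can be sketched in Python as a simple range test. The class names, parameter names and limit values below are hypothetical and serve only to show the structure.

```python
def classify(parameters, classes):
    """Return the first class whose limits contain all listed parameter values.

    classes maps a class name to {parameter_name: (lower, upper)}.
    Returns None if the device fits no predefined class.
    """
    for name, limits in classes.items():
        if all(
            key in parameters and low <= parameters[key] <= high
            for key, (low, high) in limits.items()
        ):
            return name
    return None


# Hypothetical classes and parameters (all names and values are made up).
CLASSES = {
    "normal": {"max_output": (0.0, 1.0), "peak_lambda": (2000.0, 4000.0)},
    "loose bearing": {"max_output": (1.0, 3.0), "peak_lambda": (6000.0, 9000.0)},
}

device = {"max_output": 1.8, "peak_lambda": 7500.0}
device_class = classify(device, CLASSES)
```

If class ranges overlap, this sketch returns the first match in dictionary order. A real system would need a rule for resolving such ties.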

Each class may correspond to a specific type of failure of the
device. For example, shaft imbalance, wheel imbalance, crookedness,
imperfections of teeth in cogs, tight bearings, loose bearings, etc.
may cause the device to vibrate in different characteristic ways,
whereby a characteristic mechanical vibration or sound is generated
for each type of failure. The type of failure of the device may
then be detected by comparing determined device parameters with
corresponding parameter values of various predetermined classes.

The upper and lower limits of a specific class of devices may be
determined by testing a set of devices known to belong to that
class. For example, the upper limits may be determined as the
average of specific parameter values plus three times the standard
deviation. Likewise, the lower limits may be determined as the
average of parameter values minus three times the standard
deviation.
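
Deriving such limits from a set of reference devices is a short calculation. The Python sketch below uses the sample standard deviation from the standard library. Whether the sample or the population standard deviation is intended is not stated, so that choice is an assumption, as are the function name and the example values.

```python
import statistics


def class_limits(values, width=3.0):
    """Lower and upper limit as mean -/+ width times the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumed)
    return mean - width * sd, mean + width * sd


# Made-up parameter values measured on devices known to belong to one class.
reference_values = [9.0, 10.0, 11.0]
lower, upper = class_limits(reference_values)
```

For the three example values the mean is 10 and the sample standard deviation is 1, giving limits of 7 and 13.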
CLAIMS

1. A method for determination of a parameter of a system generating
a signal containing information about the parameter, comprising the
step of short time transforming the signal substantially in
accordance with



0 

in which v_i is the signal, L is the transformed signal, a is a time
constant, ω is an angular frequency, and φ is a phase.

2. A method according to claim 1, wherein the step of transforming
comprises filtering the signal v_i with a filter having a pole at a
+ jωt and a pole at a - jωt.

3. A method according to claim 1 or 2, comprising steps of
transforming the signal v_i for a plurality of sets of a and ω
values.

4. A method according to any of the preceding claims, further
comprising the step of determining a maximum of at least one
transformed signal L(a,ω,t).

5. A method according to any of the preceding claims, further
comprising the step of comparing transformed signals L with
corresponding reference signals in order to determine parameters of
the system.

6. A method according to any of the preceding claims, further
comprising a step of pre-processing the signal before the step of
short time transforming, the pre-processing being selected from the
group consisting of filtering, rectification, differentiation,
integration, and amplification.

7. A method of transmitting a signal containing information of a
set of parameters of a system generating the signal, comprising
processing the signal according to any of the preceding claims and
further comprising the step of transmitting the determined
parameter values.

8. A method according to claim 7, further comprising the step of
generating a copy of the signal from the transmitted parameter
values.



9. A method of transmitting a signal containing information of a
set of parameters of a system generating the signal, comprising
processing the signal according to any of the preceding claims and
further comprising the steps of

comparing the signal with a library of signals generated for a
predetermined set of parameter values by the system,

selecting the library function that constitutes the best match to
the signal, and

transmitting an identification signal that identifies the matching
library function.

10. A method according to claim 9, further comprising the steps of
receiving the identification signal and generating the
corresponding library signal.

11. A method of classifying a system according to one or more
parameters of the system generating a signal containing information
about the one or more parameters, comprising determining the one or
more parameters according to any of claims 1-6 and further
comprising the step of classifying the system in accordance with
the one or more determined parameters into one class of a set of
predefined classes defined by predetermined ranges of values of the
parameters.

12. A method for communicating an auditory signal, comprising
processing the signal by the method according to any of claims 1-6,
transmitting the processed signal, and receiving the processed
signal by a receiver.

13. A method according to claim 12, wherein, prior to transmission
of the processed signal, the signal is coded into a digital
representation, and the coded signal is decoded in the receiver so
as to re-establish transient pulse shapes perceived by an animal ear
such as a human ear as representing the distinct sound pictures of
the auditory signal.

14. A method according to claim 13, wherein the digital
transmission is performed at a bandwidth of at the most 4000 bits
per second.



15. A method according to claim 14, wherein the bandwidth is at the 
most 2000 bits per second. 

16. A method according to claim 15, wherein the bandwidth is in the
interval of 800-2000 bits per second.

17. A method according to any of claims 13-16, wherein a second and
further pulses in a sequence of identical pulses are represented by
a digital value indicating repetition.



18. A method according to any of claims 1-6, comprising filtering
the signal v_i in a filter bank comprising a plurality of band-pass
filters interconnected in parallel with centre frequencies ranging
from 1400 Hz to 6500 Hz, each of which is connected in series with
an envelope detector and a filter bank comprising a plurality of
low-pass filters interconnected in parallel and having cut-off
frequencies ranging from 300 Hz to 3000 Hz and time constants a
ranging from 1500 s⁻¹ to 12000 s⁻¹.

19. An apparatus for determination of a parameter of a system
generating a signal containing information about the parameter,
comprising a processor that is adapted to short time transform the
signal substantially in accordance with



in which v_i is the signal, L is the transformed signal, a is a time
constant, ω is an angular frequency, and φ is a phase.

20. An apparatus according to claim 19, wherein the processor
comprises a filter for filtering the signal v_i and having a pole at
a + jωt and a pole at a - jωt.

21. An apparatus according to claim 19 or 20, wherein the processor
comprises a plurality of filters for filtering the signal v_i, each
filter having a different set of a and ω values.

22. An apparatus according to claim 19, wherein the apparatus
comprises a communication channel transmitter, and the processor is
adapted to determine the one or several parameters of the system,
and to transmit the one or several system parameters over a wireless
or a cable communication channel.